NSF PAR Search | NSF Public Access Repository

Sustaining Human Agency, Attending to Its Cost: An Investigation into Generative AI Design for Non-Native Speakers' Language Use

https://doi.org/10.1145/3706598.3713626

Xiao, Yimin; Hancock, Cartor; Agrawal, Sweta; Mehandru, Nikita; Salehi, Niloufar; Carpuat, Marine; Gao, Ge (April 2025, ACM)

Full Text Available
Sustaining Human Agency, Attending to Its Cost: An Investigation into Generative AI Design for Non-Native Speakers' Language Use

Xiao, Yimin; Hancock, Cartor; Agrawal, Sweta; Mehandru, Nikita; Salehi, Niloufar; Carpuat, Marine; Gao, Ge (April 2025, arXiv)

Full Text Available
Exploring Videoconferencing for Older Adults with Cognitive Concerns Using a Dramaturgical Lens

https://doi.org/10.1145/3663548.3675647

Hu, Ruipu; Gao, Ge; Lazar, Amanda (October 2024, ACM)

While videoconferencing is a promising technology, it may present unique challenges and barriers for older adults with cognitive concerns. This paper presents a deconstructed view of videoconferencing use through the sociological dramaturgical framework developed by Erving Goffman. Our study recruited 17 older adults with varying cognitive concerns, employing technology discussion groups, interviews, and observations to gather data. Through a reflexive thematic analysis, we explore videoconferencing use among older adults with cognitive concerns, focusing on three major areas: the "performances and roles," where users adapt to new roles through videoconferencing; the "backstage," which involves the physical and logistical setup; and the "frontstage," where people communicate through audio and visual channels to present a desired impression. Our discussion generates insights into how deconstructing these elements can inform more meaningful and accessible HCI design.
Full Text Available
Aligning LLM Agents by Learning Latent Preference from User Edits

Gao, Ge; Taymanov, Alexey; Salinas, Eduardo; Mineiro, Paul; Misra, Dipendra (December 2024, NeurIPS)

We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent response to personalize it based on their latent preference, in addition to improving the correctness. The edit feedback is naturally generated, making it a suitable candidate for improving the agent's alignment with the user's preference, and for reducing the cost of user edits over time. We propose a learning framework, PRELUDE, that infers a description of the user's latent preference based on historic edit data. The inferred user preference descriptions are used to define prompts for generating responses in the future. This avoids fine-tuning the agent, which is costly, challenging to scale with the number of users, and may even degrade its performance on other tasks. Furthermore, learning descriptive preference improves interpretability, allowing the user to view and modify the learned preference. However, user preference can be complex and subtle, and can vary based on context, making it challenging to learn. To address this, we propose a simple yet effective algorithm named CIPHER that leverages the LLM to infer the user preference for a given context based on user edits. For future contexts, CIPHER retrieves inferred preferences from the k-closest contexts in the history, and forms an aggregate preference for response generation. We introduce two interactive environments, summarization and email writing, and use a GPT-4 simulated user for evaluation. On both tasks, CIPHER outperforms several baselines by achieving the lowest edit distance cost while only having a small overhead in LLM query cost. Our analysis reports that user preferences learned by CIPHER show significant similarity to the ground truth latent preferences.
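The retrieval step described in this abstract (look up the k closest past contexts, then combine their inferred preferences) can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words `embed` function, the cosine similarity, the `retrieve_preferences` and `aggregate` helpers, and the join-based aggregation are all assumptions chosen to keep the example self-contained. The abstract does not specify the context encoder or how the aggregate preference is formed.

```python
import math
from collections import Counter


def embed(text):
    # Assumption: a bag-of-words count vector stands in for a real context encoder.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_preferences(history, context, k=2):
    # history: list of (past_context, inferred_preference) pairs.
    # Returns the preferences attached to the k most similar past contexts.
    query = embed(context)
    ranked = sorted(
        history,
        key=lambda item: cosine(query, embed(item[0])),
        reverse=True,
    )
    return [preference for _, preference in ranked[:k]]


def aggregate(preferences):
    # Assumption: deduplicate and join. The abstract only says an aggregate
    # preference is formed, not how.
    unique = []
    for preference in preferences:
        if preference not in unique:
            unique.append(preference)
    return "; ".join(unique)


history = [
    ("summarize this news article about sports", "use short bullet points"),
    ("write an email to my manager", "formal tone"),
    ("summarize this news article about politics", "avoid jargon"),
]
retrieved = retrieve_preferences(history, "summarize a news article about science", k=2)
prompt_preference = aggregate(retrieved)
```

In this toy history, the two summarization contexts share the most words with the query, so their preferences are the ones retrieved; the aggregated string would then be placed in the prompt for the next response.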
Full Text Available
I Could’ve Asked That: Reformulating Unanswerable Questions

Zhao, Wenting; Gao, Ge; Cardie, Claire; Rush, Alexander M (November 2024, EMNLP)

When seeking information from unfamiliar documents, users frequently pose questions that cannot be answered by the documents. While existing large language models (LLMs) identify these unanswerable questions, they do not assist users in reformulating their questions, thereby reducing their overall utility. We curate CouldAsk, an evaluation benchmark composed of existing and new datasets for document-grounded question answering, specifically designed to study reformulating unanswerable questions. We evaluate state-of-the-art open-source and proprietary LLMs on CouldAsk. The results demonstrate the limited capabilities of these models in reformulating questions. Specifically, GPT-4 and Llama2-7B successfully reformulate questions only 26% and 12% of the time, respectively. Error analysis shows that 62% of the unsuccessful reformulations stem from the models merely rephrasing the questions or even generating identical questions. We publicly release the benchmark and the code to reproduce the experiments.
Full Text Available
Off-Policy Evaluation for Human Feedback

Gao, Qitong; Gao, Ge; Dong, Juncheng; Tarokh, Vahid; Chi, Min; Pajic, Miroslav (December 2024, The Thirty-Eighth Annual Conference on Neural Information Processing Systems)

Full Text Available
Using Personal Exposure Measurement to Manage Environmental Stressors

https://doi.org/10.63044/w25and110

Andrews, Clinton; Shahid, Yousaf; Andrews, Abigail; Gao, Ge; Gong, Jie; Josephs, Holly; Kim, Sunyoung; Li, Yitong; Maddila, Vijay; Mainelis, Gediminas; et al (May 2025, ASHRAE)

Personal exposures to environmental stressors including extreme heat and air pollution vary widely depending on schedules and activities. This paper shares results of a city-scale project to build fixed indoor and outdoor sensor networks while also deploying mobile sensors. The network helps building occupants, building operators, and public officials to safely manage extreme heat and air pollution. The Exposure Duration Curve (EDC) concept is introduced to facilitate comparisons.
Full Text Available
On Trajectory Augmentations for Off-Policy Evaluation

Gao, Ge; Gao, Qitong; Yang, X; Ju, S; Pajic, Miroslav; Chi, Min (April 2024, 12th International Conference on Learning Representations (ICLR))

Full Text Available
I Could’ve Asked That: Reformulating Unanswerable Questions

https://doi.org/10.18653/v1/2024.emnlp-main.242

Zhao, Wenting; Gao, Ge; Cardie, Claire; Rush, Alexander M (January 2024, Association for Computational Linguistics)

Full Text Available
Policy-Gradient Training of Language Models for Ranking

Gao, Ge; Chang, Jonathan D; Cardie, Claire; Brantley, Kianté; Joachims, Thorsten (December 2023, NeurIPS Workshop on Foundation Models for Decision Making)

Full Text Available
